import airbnb
from IPython.display import Image, HTML
HTML('''<script>
code_show=true;
function code_toggle() {
if (code_show){
$('div.input').hide();
} else {
$('div.input').show();
}
code_show = !code_show
}
$( document ).ready(code_toggle);
</script>
<form action="javascript:code_toggle()"><input type="submit"
value="Click here to toggle on/off the raw code."></form>''')
Image("Melbourne.jpg")
Melbourne, the capital of Victoria, Australia, blends world-class restaurants, museums, art, and unique experiences, making it a go-to place for tourists. It also houses major government offices and large universities. In recent years, short-term rental platforms like AirBnB have become very popular in Melbourne. The added long-term stay feature allows lessees to rent a property for a longer term. This captured another customer segment: those looking for an affordable place to stay for a longer period. For prospective hosts and tourists/lessees alike, it is helpful to understand how listings vary when comparing existing listings. One difficulty, however, is the large number of parameters or features involved in comparing these listings. This is where dimensionality reduction and clustering can be used.
This study utilized information from AirBnB listings to identify which features (both intrinsic and extrinsic) contribute the most to listing diversity. Correlations among features were also determined. Through exploratory analysis and dimensionality reduction, we found that amenities related to long-term stays contribute the most to listing diversity. This suggests that most hosts are gearing towards allowing their renters to stay for longer periods.
Price is the main distinguishing feature that segregates listings into clusters based on agglomerative clustering, while all other features are more or less the same among clusters. Renters who are looking for either short- or long-term stays can choose among the clusters with the same characteristics but the lowest price to make the most of their budget.
Although the 10-cluster model resulting from agglomerative clustering can distinguish clusters based on price of listings, a more granular study can be done by analyzing the subclusters within each cluster. This may give more insights as to the possible features that can further differentiate the clusters.
Airbnb is a platform where people can list and book accommodations around the world. In recent years, Airbnb has been reshaping how people find and rent short-term accommodation. Melbourne was one of the early adopters of Airbnb and is one of the top 10 cities for global travelers on the platform. With over 18,000 listings in Melbourne as of July 2021, Airbnb is a major provider of short-term accommodation, offering a diverse range of property types, amenities, and hosts. All this information is easily accessible through the app. Guests can communicate with hosts and vice versa. Both hosts and guests are rated, so that everyone can expect a certain level of quality.
This study aims to determine which features cluster together to discover patterns and similarities in Airbnb listings. To achieve this, we used multiple clustering methods to determine which listings cluster together and how they cluster. This would be beneficial to customers because users can simply filter the listings based on specific features of their choosing rather than browsing through countless Airbnb listings.
Problem Statement: What are the features, both intrinsic and extrinsic, that contribute to clustering of AirBnB listings in Melbourne, Australia?
Secondary Research Questions:
Shown below is the general workflow that was followed in the study (Figure 1).
First, data was retrieved from insideairbnb.com. The dataset was inspected for pre-processing. Unnecessary columns were removed. Missing values were handled either by imputation (less than 25% missing values) or by dropping the columns (more than or equal to 25% missing values). Similar but inconsistent values were corrected using regex.
After data cleaning, exploration of data was done to gather important insights. In preparation for dimensionality reduction, feature engineering was done. A binary representation was created for list-based columns such as amenities and host verifications. Finally, min-max scaling was performed.
For dimensionality reduction, truncated SVD was the chosen technique since the dataset is a mixture of dense and sparse data. Projections of the features were plotted along the singular vectors with the highest contributions to listing variation. For clustering, we explored three clustering methods—agglomerative, Kmeans, and Kmedians. We chose not to include Kmedians in our results because we did not get any significant insights. Results were analyzed to obtain insights that may be beneficial both for the hosts and those who plan to book an Airbnb in Melbourne.
Image("Methodology_updated.png")
airbnb.fig_caption('General workflow for the Study', 1)
Data for this study were retrieved from insideairbnb.com, an independent, non-commercial set of tools and data for exploring how Airbnb is really being used in cities around the world. The file listings.csv.gz was used for this study. It contains information about each listing, its host, and its availability.
Information on Melbourne municipalities and neighbourhoods included in each municipality was obtained here.
Descriptions of the variables in the dataset can be found in the Data Dictionary in Table 1.
airbnb.table_caption('Data Dictionary for the AirBnB Dataset '
'and Melbourne Municipality File', 1)
File Name: listings.csv.gz
| Column | Type | Description |
|---|---|---|
| id | INTEGER | Airbnb's unique identifier for the listing |
| host_is_superhost | BOOLEAN | Host is a superhost |
| host_listings_count | INTEGER | Number of listings the host has |
| host_verifications | TEXT | Verified channels of the host |
| neighbourhood_cleansed | TEXT | Neighbourhood geocoded using latitude and longitude |
| latitude | INTEGER | World Geodetic System projection |
| longitude | INTEGER | World Geodetic System projection |
| property_type | TEXT | Type of property of the listing |
| room_type | TEXT | Type of room of the listing |
| accommodates | INTEGER | Maximum capacity of listing |
| bedrooms | INTEGER | Number of bedrooms |
| beds | INTEGER | Number of beds |
| amenities | TEXT | List of amenities |
| price | INTEGER | Daily price in local currency |
| minimum_nights | INTEGER | Minimum number of nights per stay for the listing |
| has_availability | BOOLEAN | If listing is available |
| availability_30 | INTEGER | Availability of the listing 30 days in the future |
| availability_60 | INTEGER | Availability of the listing 60 days in the future |
| availability_90 | INTEGER | Availability of the listing 90 days in the future |
| availability_365 | INTEGER | Availability of the listing 365 days in the future |
| number_of_reviews | INTEGER | Number of reviews the listing has |
| number_of_reviews_ltm | INTEGER | Number of reviews the listing has in the last 12 months |
| review_scores_rating | INTEGER | Average review score of listing |
| calculated_host_listings_count | INTEGER | Listings the host has in the current scrape |
| reviews_per_month | INTEGER | Average number of reviews the listing receives per month |
| number_of_baths | INTEGER | Number of baths the listing has |
| bath_type | TEXT | Type of bath the listing has |
File Name: melbourne_municipalities.csv
| Column | Type | Description |
|---|---|---|
| municipality | TEXT | Different municipalities under Melbourne |
| neighbourhood_cleansed | TEXT | Neighbourhood names in each municipality |
# import libraries
import pandas as pd
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
The dataset was loaded as a dataframe with 18605 rows/listings and 74 features. Data cleaning was done to ensure consistency of the formats of entries in some columns. Unnecessary columns were also removed. After cleaning and filtering to include only the main property types (apartment, house, townhouse, condominium, and guesthouse), the resulting dataframe had 16959 unique listings.
# Data Loading
listings = pd.read_csv('/mnt/data/public/insideairbnb/data.insideairbnb.com/'
'australia/vic/melbourne/2021-07-05/data/'
'listings.csv.gz',
compression="gzip")
#Cleaning
df = airbnb.clean_data(listings)
pd.options.mode.chained_assignment = None # default='warn'
amenities, with_amenities = airbnb.clean_amenities(df)
verif_df, expanded_df = airbnb.clean_verifications(with_amenities)
The data was explored to determine whether any columns have missing values. Table 2 shows the summary of missing values per column. For the purpose of this study, we set a threshold of 25% for missing values. All columns with 25% or more missing values were automatically removed. The resulting summary of missing values after removing these columns is shown in Table 3.
airbnb.table_caption('Missing Values per Column', 2)
# Check percent missing per column
missing = airbnb.missing(df)
missing.head(15)
# Drop columns with >= 25% missing
clean_df = airbnb.drop_missing(df, missing)
airbnb.table_caption('Missing Values per Column after Dropping Columns', 3)
new_missing = airbnb.missing(clean_df)
new_missing.head(10)
To handle the remaining missing values, imputation was done. For categorical variables, missing values were filled with the most frequent value in that column. For numerical variables, missing values were replaced with the column median. Table 4 shows the working dataset after cleaning and imputation. Only 29 columns remained after cleaning.
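The drop-then-impute rule described above can be sketched on a toy frame. This is only an illustration under stated assumptions: `drop_and_impute` is a hypothetical helper and the toy columns are made up; the actual logic lives in `airbnb.drop_missing` and `airbnb.impute`, whose internals are not shown in this notebook.

```python
import numpy as np
import pandas as pd


def drop_and_impute(df, threshold=0.25):
    """Hypothetical helper: drop columns whose missing share is >= threshold,
    then fill numeric gaps with the median and other gaps with the mode."""
    missing_share = df.isna().mean()
    kept = df.loc[:, missing_share < threshold].copy()
    for col in kept.columns:
        if kept[col].isna().any():
            if pd.api.types.is_numeric_dtype(kept[col]):
                kept[col] = kept[col].fillna(kept[col].median())
            else:
                kept[col] = kept[col].fillna(kept[col].mode().iloc[0])
    return kept


# Toy data (not from the real dataset): 20% missing in two columns,
# 80% missing in the third.
toy = pd.DataFrame({
    "price": [100.0, np.nan, 300.0, 200.0, 250.0],
    "room_type": ["Entire home/apt", "Private room", None,
                  "Entire home/apt", "Entire home/apt"],
    "mostly_empty": [None, None, None, "x", None],
})

cleaned = drop_and_impute(toy)
print(cleaned)
```

In this toy example the mostly empty column is dropped, the missing price becomes the median of the observed prices, and the missing room type becomes the most frequent category.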
airbnb.table_caption('Working Dataset after Cleaning and Imputation', 4)
final_df = airbnb.impute(clean_df, new_missing)
final_df.head(5)
final_df.describe()
To gain some insights about AirBnB listings in Melbourne, data such as number of listings, location, price, availability, and even information about the host were examined.
Neighbourhood
As of July 2021, 30% of listings are located in Melbourne City (Figure 2). The City of Melbourne is the central area of Melbourne in Victoria, Australia. Melbourne, Australia's second largest city, is the capital of Victoria, and this is where the Victorian government is located. The headquarters of many companies and of government and non-government agencies are also found here (About Melbourne - City of Melbourne, n.d.). For these reasons, the City of Melbourne is a very strategic location for AirBnB listings.
airbnb.plot_neighborhood_count(final_df)
airbnb.fig_caption('Number of listings in Different Neighbourhoods '
'in Melbourne', 2)
Property Type
Apartments and houses are the most listed property types in Melbourne (Figure 3). One possible reason why apartments are an attractive investment for property listing sites like AirBnB is the number and diversity of amenities that can be offered. Good-quality apartments have shared areas and features like gyms, pools, and parking spaces. This is advantageous for investors because they do not have to pay for separate amenities per listing. For travellers or renters, it is a great opportunity to socialize and meet new people.
airbnb.plot_property_count(final_df)
airbnb.fig_caption('Number of listings based on Property Type', 3)
Room Type
Most of the listings are entire homes/apartments (10,830 listings) and only 0.02% are shared or hotel rooms (Figure 4). AirBnB has a "long-term rental" feature wherein hosts can rent out their place for at least twenty-eight (28) days (Airbnb Launched Long-Term Rentals, 2020). This is a good opportunity to increase the occupancy rate of a listing, especially during non-peak periods for travellers. For the lessee, booking an AirBnB property instead of a traditional lease may be even more affordable. Entire homes or apartments are the most appropriate room type for long-term lease.
airbnb.plot_room_count(final_df)
airbnb.fig_caption('Number of listings based on Room Type', 4)
Per Neighbourhood
Figure 5 shows that Yarra Ranges has the highest average price at 318 AUD per night. This could be because Yarra Ranges has a scenic national park, which makes it a hotspot for vacationing tourists. In addition, it houses community and arts centers which may be of interest to tourists (Yarra Ranges Council, n.d.).
On the other hand, Brimbank has the cheapest average price of 96 AUD per night (Figure 6).
airbnb.plot_avg_price_neighbour(final_df, bracket='top')
airbnb.fig_caption('Listings with the Highest Average Price', 5)
airbnb.plot_avg_price_neighbour(final_df, bracket='low')
airbnb.fig_caption('Listings with the Lowest Average Price', 6)
Property and Room Type
Apartments and houses have a large price range compared to others (Figure 7). This can be attributed to diversity in number and type of amenities that can be offered as well as the size of properties listed. On the other hand, the more expensive room types are entire home/apartment and private rooms (Figure 8).
airbnb.plot_prop_price(final_df)
airbnb.fig_caption('Box Plots of Prices for Different Property Types', 7)
airbnb.plot_room_price(final_df)
airbnb.fig_caption('Box Plots of Prices for Different Room Types', 8)
Parking is the most common amenity found in the listings in Melbourne (Figure 9). Other top amenities include tv, kitchen, essentials, wifi, washer, heating, hangers, and airconditioning. These top amenities are important features of a property that accommodates long-term stays. This suggests that most of the listings are equipped for longer rental terms.
airbnb.plot_amenities(amenities)
airbnb.fig_caption('Frequency of Amenities Found in Listings in Melbourne', 9)
A rental business should be profitable. It is therefore necessary to look at the occupancy of existing listings in Melbourne. Figure 10 shows that, on average, listings in Yarra have the lowest availability for the next year. Yarra may be a good neighbourhood to consider when planning to put up a property for AirBnB. On the other hand, properties in Melton generally have the highest number of days available over the next 365 days.
airbnb.plot_availability(final_df)
airbnb.fig_caption('Availability (Days) of Listings '
'in the Next 30, 60, 90, and 365 '
'days', 10)
For people who are interested in listing properties on AirBnB, one important factor is the ease of doing business. There are many ways by which hosts are verified. Over 90% of the hosts on AirBnB use their email as a means of verification on the platform (Figure 11).
airbnb.plot_verifications(verif_df)
airbnb.fig_caption('Means of Verification of Hosts of Listings in Melbourne',
11)
Another important factor to look at is the number of listings per host. This may shed light on how many listings one host can manage without sacrificing quality. Hosts in Melbourne usually have more than 1 listing (Figure 12). One host has 408 listings. This host could possibly be a hotel, condominium, or apartment owner who rents out individual rooms of their property.
airbnb.listing_per_host(final_df)
airbnb.fig_caption('Box Plot of the Number of Listings per Host', 12)
Lastly, it is important to consider the general review scores of listings in Melbourne. AirBnB listings in Melbourne generally have high review scores, with a median of 4.275 (Figure 13). This indicates that customers are generally satisfied with the quality of the property and their experience when they booked it. There are many property management companies in Melbourne whose main function is to manage the property on behalf of the host. They know the current trends in short-term rentals, including what tourists or lessees usually look for. This may be one of the factors behind such high ratings.
airbnb.rating_host(final_df)
airbnb.fig_caption('Box Plot of the Review Score Ratings of '
'Listings in Melbourne', 13)
After data cleaning, feature engineering was done. A binary bag-of-words representation was created for list-based columns such as amenities and host verifications. The resulting dataset after one-hot encoding of categorical variables and binary bag-of-words representation is shown in Table 5.
airbnb.table_caption('Dataset after One-hot Encoding of Categorical '
'Variables and Binary Representation of Amenities '
'and Host Verifications', 5)
# One-hot encoding of all categorical variables
ohe_df = airbnb.ohe(final_df)
ohe_amenities = airbnb.ohe_amenities(ohe_df, amenities)
ohe_all = airbnb.ohe_verifications(ohe_amenities, verif_df)
ohe_all.head()
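The encoding step above, one-hot encoding for categorical columns and a binary indicator per amenity for list-based columns, can be sketched as follows. This is a minimal illustration on made-up rows; it does not reproduce `airbnb.ohe` or `airbnb.ohe_amenities`, and the `amenities_` prefix is assumed only because the feature names reported later use it.

```python
import pandas as pd

# Toy listings (not from the real dataset); amenities are already parsed
# into Python lists here.
toy = pd.DataFrame({
    "id": [1, 2, 3],
    "room_type": ["Entire home/apt", "Private room", "Entire home/apt"],
    "amenities": [["wifi", "kitchen"], ["wifi"], ["kitchen", "tv"]],
})

# One-hot encode the categorical column.
room_dummies = pd.get_dummies(toy["room_type"], prefix="room_type").astype(int)

# Binary bag-of-words: one 0/1 column per distinct amenity.
vocabulary = sorted({a for items in toy["amenities"] for a in items})
amenity_flags = pd.DataFrame({
    f"amenities_{a}": toy["amenities"].apply(lambda items, a=a: int(a in items))
    for a in vocabulary
})

encoded = pd.concat([toy[["id"]], room_dummies, amenity_flags], axis=1)
print(encoded)
```

Each listing ends up as a row of 0/1 flags, which is why the feature count grows quickly once every amenity and verification channel gets its own column.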
Finally, min-max scaling was done. There were a total of 267 features to analyze.
scaled_df, df_id = airbnb.scaling(ohe_all)
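Min-max scaling maps each feature to the range [0, 1] via (x - min) / (max - min), so that prices in the hundreds do not dominate the 0/1 indicator columns. A minimal sketch with scikit-learn's `MinMaxScaler` is shown below; the toy values are made up, and whether `airbnb.scaling` uses this exact class is an assumption.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Toy features on very different scales (not from the real dataset).
toy = pd.DataFrame({
    "price": [50.0, 100.0, 250.0],
    "beds": [1, 2, 5],
    "amenities_wifi": [0, 1, 1],
})

scaler = MinMaxScaler()
scaled = pd.DataFrame(scaler.fit_transform(toy),
                      columns=toy.columns, index=toy.index)
print(scaled)
```

Binary columns are left unchanged by this transform, while numeric columns are compressed into the same [0, 1] range.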
Truncated SVD was performed for dimensionality reduction. This specific technique was chosen since the dataset is a mixture of dense and sparse data.
X_new, exp_var, sv_comp = airbnb.truncated_svd(scaled_df)
Figure 14 below shows the variance explained by each SV as well as the cumulative variance. If we choose 90% as threshold for the explained variance, we will reduce the number of features from 267 to 75 SVs.
sv_cutoff = airbnb.plot_variance(exp_var)
airbnb.fig_caption('Individual and Cumulative Variance of the SVs', 14)
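The variance-threshold rule above can be sketched with scikit-learn's `TruncatedSVD`. The data below are synthetic and the variable names are illustrative; this is not the implementation of `airbnb.truncated_svd` or `airbnb.plot_variance`, only one way to find the number of SVs that reaches a 90% cumulative explained variance.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD

# Synthetic stand-in for the scaled feature matrix (assumption: 200 rows,
# 30 features, values in [0, 1]).
rng = np.random.default_rng(0)
X = rng.random((200, 30))

# Keep one fewer component than features so the choice of solver does not
# matter (the arpack solver requires strictly fewer).
svd = TruncatedSVD(n_components=X.shape[1] - 1, random_state=0)
X_new = svd.fit_transform(X)

# Smallest number of SVs whose cumulative explained variance reaches 90%.
cumulative = np.cumsum(svd.explained_variance_ratio_)
n_keep = int(np.searchsorted(cumulative, 0.90) + 1)
print(n_keep, cumulative[n_keep - 1])
```

Downstream steps would then work on `X_new[:, :n_keep]`, which mirrors how the first 75 SVs are used for KMeans later in this notebook.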
df_sv = airbnb.sv_comp_df(scaled_df, ohe_all, sv_comp)
airbnb.plot_SVcomponents(df_sv, sv_comp)
airbnb.fig_caption('Weights of the Top Features for SV1, SV2, '
'SV3, and SV4', 15)
Weights of the top features for the first four SVs are shown in Figure 15. SV1 explains about 61.5% of the variance observed in AirBnB listings in Melbourne. Features related to host verifications (host_verifications_phone, host_verifications_email), availability (has_availability_t), amenities (amenities_kitchen, amenities_essentials, amenities_smoke_alarm, amenities_wifi, amenities_washer, amenities_heating, amenities_hangers, amenities_long term stays allowed, amenities_tv, amenities_iron, and amenities_airconditioning), and review_scores_rating contribute the most to this SV. It seems that this SV captures features that are related to long-term stays of renters.
SV2 explains about 4.3% of the variation in the listings. For SV2, the top features in terms of weight are mostly related to amenities: amenities_refrigerator, amenities_oven, amenities_microwave, amenities_stove, amenities_dishes silverware, amenities_cooking basics, amenities_bed linens, amenities_dishwasher, amenities_patio or balcony, amenities_extra pillows blankets, amenities_hotwater, and amenities_coffee maker. The features that mostly contribute to SV2 are amenities needed for cooking.
About 1.9% of the variation in the listings can be explained by SV3. Mostly features that are intrinsic to the listings, such as room type, property type, bath type, as well as the neighborhood in which the listings are located contributed most to SV3. Lastly, SV4 can be explained by features which are mostly about future availability of the listings as well as those related to host verifications.
In general, the features that contribute most to the diversity of listings in Melbourne are amenities.
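Reading off which features contribute most to an SV amounts to ranking the entries of the corresponding right singular vector by absolute weight. The sketch below illustrates this on synthetic data with hypothetical feature names; `top_features` is an illustrative helper, not part of the `airbnb` module.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import TruncatedSVD

# Synthetic data with hypothetical feature names.
rng = np.random.default_rng(1)
feature_names = [f"feature_{i}" for i in range(8)]
X = pd.DataFrame(rng.random((100, 8)), columns=feature_names)

svd = TruncatedSVD(n_components=3, random_state=0).fit(X)

# Rows are original features, columns are singular vectors.
weights = pd.DataFrame(svd.components_.T, index=feature_names,
                       columns=["SV1", "SV2", "SV3"])


def top_features(weights, sv, n=3):
    """Return the n features with the largest absolute weight on one SV."""
    order = weights[sv].abs().sort_values(ascending=False).index[:n]
    return weights.loc[order, sv]


top_sv1 = top_features(weights, "SV1")
print(top_sv1)
```

Ranking by absolute value matters because a large negative weight contributes to an SV just as strongly as a large positive one.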
Pairwise (by SV) scatterplots of listings and projection of features along these SVs are shown in Figures 16 to 18.
airbnb.plot_svd_zoomed(X_new, df_sv, sv_comp, 1, 2)
airbnb.fig_caption('Scatterplot of Listings plotted along SV1 and SV2 (left)'
' and Projection of the Original Features along the '
'same SVs (right)', 16)
airbnb.plot_svd(X_new, df_sv, sv_comp, 2, 3)
airbnb.fig_caption('Scatterplot of Listings plotted along SV2 and SV3 (left)'
' and Projection of the Original Features along the '
'same SVs (right)', 17)
airbnb.plot_svd(X_new, df_sv, sv_comp, 1, 3)
airbnb.fig_caption('Scatterplot of Listings plotted along SV1 and SV3 (left)'
' and Projection of the Original Features along the '
'same SVs (right)', 18)
To determine the propensity of AirBnB listings in Melbourne to cluster based on their overall features, both hierarchical clustering (Agglomerative using Ward's Method) and representative-based clustering (Kmeans and KMedians) were performed.
model = airbnb.agglo_cluster(X_new)
linkage_matrix = airbnb.plot_agglo(X_new, model)
airbnb.fig_caption('Dendrogram showing the clustering of AirBnB listings'
' in Melbourne ', 19)
y_pred_300 = airbnb.agglo_fcluster(X_new, linkage_matrix, 300)
airbnb.fig_caption('Scatterplot showing the clustering of AirBnB listings'
' in Melbourne at Delta = 300', 20)
y_pred_250 = airbnb.agglo_fcluster(X_new, linkage_matrix, 250)
airbnb.fig_caption('Scatterplot showing the clustering of AirBnB listings'
' in Melbourne at Delta = 250', 21)
y_pred_150 = airbnb.agglo_fcluster(X_new, linkage_matrix, 150)
airbnb.fig_caption('Scatterplot showing the clustering of AirBnB listings'
' in Melbourne at Delta = 150', 22)
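The hierarchical step above, Ward linkage followed by cutting the dendrogram at a distance threshold (delta), can be sketched with SciPy. The blobs and thresholds below are toy values chosen for illustration; the thresholds of 300, 250, and 150 used in this study apply to the real data only, and the internals of `airbnb.agglo_fcluster` are assumed rather than shown.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Three well-separated synthetic blobs of 30 points each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centers])

# Ward's method merges the pair of clusters that least increases
# within-cluster variance at each step.
Z = linkage(X, method="ward")

# Cutting at a larger delta yields fewer, coarser clusters.
labels_coarse = fcluster(Z, t=60.0, criterion="distance")
labels_fine = fcluster(Z, t=10.0, criterion="distance")

print(len(np.unique(labels_coarse)), len(np.unique(labels_fine)))
```

Lowering delta splits the data into more clusters, which is the same trade-off explored above when moving from delta = 300 down to delta = 150.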
from sklearn.cluster import KMeans
res_kmeans = airbnb.cluster_range(X_new[:, :75], KMeans(random_state=143), 2, 11)
#Plot SV1 and SV2
airbnb.plot_clusters(X_new[:, :75], res_kmeans['ys'], 0, 1)
airbnb.fig_caption('Scatterplot showing the clustering of AirBnB listings'
' at different values of k', 23)
#Plot SV2 and SV3
airbnb.plot_clusters(X_new[:, :75], res_kmeans['ys'], 1, 2)
airbnb.fig_caption('Scatterplot showing the clustering of AirBnB listings'
' at different values of k', 24)
airbnb.plot_num_clusters(res_kmeans)
airbnb.fig_caption('Plots of Different Internal Validation'
' Criteria', 25)
The scatterplots at different values of k, along both SV1 & SV2 and SV2 & SV3, suggest that the best clustering is at k=3 (Figures 23 and 24). The points in the same cluster are compact and the clusters are balanced. The clustering is also parsimonious. This is further supported by internal validation criteria such as the sum of squared distances to centroids (inertia), the Calinski-Harabasz index, and the silhouette coefficient (Figure 25).
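The internal validation criteria named above can be computed directly with scikit-learn. The sketch below runs on synthetic blobs and is not the implementation of `airbnb.cluster_range`; only the `random_state=143` value is taken from this notebook, and selecting k by the highest silhouette coefficient is one illustrative rule.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import calinski_harabasz_score, silhouette_score

# Three well-separated synthetic blobs of 40 points each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [8.0, 0.0], [0.0, 8.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(40, 2)) for c in centers])

scores = {}
for k in range(2, 7):
    km = KMeans(n_clusters=k, n_init=10, random_state=143).fit(X)
    scores[k] = {
        "inertia": km.inertia_,  # lower means tighter clusters
        "calinski_harabasz": calinski_harabasz_score(X, km.labels_),  # higher is better
        "silhouette": silhouette_score(X, km.labels_),  # higher is better
    }

# Pick the k with the highest silhouette coefficient.
best_k = max(scores, key=lambda k: scores[k]["silhouette"])
print(best_k)
```

Inertia tends to fall as k grows, so it is usually read for an elbow rather than a minimum, while the other two criteria can be compared directly across k.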
# Data with cluster labels
labelled_df = airbnb.df_with_labels(df_id, scaled_df, y_pred_150, y_pred_250, y_pred_300)
features_with_clusters = labelled_df.merge(final_df, how='left',
left_on='id', right_on='id')
features = ['id','neighbourhood_cleansed', 'property_type', 'room_type',
'amenities', 'price_x', 'cluster_kmeans', 'cluster_agg_d150',
'cluster_agg_d250', 'cluster_agg_d300']
clusters_df = features_with_clusters[features]
clusters_df.head()
The distributions of listing prices per cluster for KMeans and Agglomerative Clustering were compared (Figure 26). Based on the figure, we can clearly see that the 3-cluster model resulting from KMeans Clustering cannot differentiate listings based on prices. On the other hand, the 10-cluster model from Agglomerative Clustering shows some differences in the average listing prices. Table 6 shows the average price per cluster based on the 10-cluster model. Cluster 1 has the lowest average price (92.68 AUD) while cluster 3 has the highest average price (212.73 AUD).
Analysis of other features such as location (neighborhood), property type and room type showed almost the same features available in the clusters which means these cannot distinguish one cluster from another.
We also performed an analysis on the listings using KMedians clustering but our results were not significant enough. We determined this when we saw no clear clusters being formed at any given K, so we decided to omit it from our results and discussion.
From this point forward, analysis of the amenities will be done based on the 10-cluster model from Agglomerative clustering.
import matplotlib.pyplot as plt
import seaborn as sns
fig, ax = plt.subplots(2, 1, figsize=(15, 25))
for n, i in enumerate(['cluster_agg_d150', 'cluster_kmeans']):
    sns.boxplot(x=clusters_df[i], y=clusters_df['price_x'], ax=ax[n],
                showfliers=False)
airbnb.fig_caption('Boxplots showing the distribution of listing prices '
'in Different Clusters Using Agglomerative Clustering '
'(Upper) and KMeans Clustering (Lower)', 26)
airbnb.table_caption('Average Price of AirBnB Listings per Cluster', 6)
price_per_cluster = pd.DataFrame(clusters_df.groupby('cluster_agg_d150')['price_x'].mean())
price_per_cluster = price_per_cluster.rename(columns = {'price_x': 'price ($)'})
price_per_cluster
cols = list(range(1,80))
amenities_only = pd.concat([with_amenities['id'], with_amenities[cols]], axis=1)
cluster_amenities = clusters_df.merge(amenities_only, how='left',
left_on='id', right_on='id')
cols_to_drop = ['neighbourhood_cleansed', 'property_type', 'room_type',
'amenities', 'price_x', 'cluster_kmeans',
'cluster_agg_d250', 'cluster_agg_d300']
cluster_amenities = cluster_amenities.drop(cols_to_drop, axis=1)
Analysis of the top 10 amenities of the clusters revealed that the clusters in the 10-cluster model from agglomerative clustering share more or less the same types of amenities: wifi, kitchen, essentials, parking, washer, heating, long term stays allowed, tv, hangers, and airconditioning. The clusters differ only minimally in the ranking of the most common amenities. Tables showing the most frequent amenities for some clusters are shown below.
Our clustering analysis showed that price is the main feature distinguishing how the listings clustered together.
airbnb.table_caption('Most Common Amenities in Cluster 1', 7)
df1 = cluster_amenities[cluster_amenities['cluster_agg_d150']==1]
amenities_count = pd.DataFrame(df1.stack().value_counts())
top_amenities = amenities_count.head(11).reset_index().rename(columns={'index':'amenities',
0:'count'})
top_amenities.iloc[1:]
airbnb.table_caption('Most Common Amenities in Cluster 4', 8)
df4 = cluster_amenities[cluster_amenities['cluster_agg_d150']==4]
amenities_count = pd.DataFrame(df4.stack().value_counts())
top_amenities = amenities_count.head(11).reset_index().rename(columns={'index':'amenities',
0:'count'})
top_amenities = top_amenities.drop(2)
top_amenities
airbnb.table_caption('Most Common Amenities in Cluster 9', 9)
df9 = cluster_amenities[cluster_amenities['cluster_agg_d150']==9]
amenities_count = pd.DataFrame(df9.stack().value_counts())
top_amenities = amenities_count.head(6).reset_index().rename(columns={'index':'amenities',
0:'count'})
top_amenities = top_amenities.drop(1)
top_amenities
Aside from the United States and European countries, Australia is one destination worth considering for tourists. Short term rentals like AirBnB are very popular especially in areas like Melbourne.
Through this exploratory analysis of AirBnB listings in Melbourne, we were able to determine that amenities like wifi, kitchen, essentials, parking, washer, heating, long term stays allowed, tv, hangers, and airconditioning are the most important features overall contributing to the variation in the listings in Melbourne. These features are somewhat related to long-term stays, which could mean that most hosts are gearing towards allowing their renters to stay for longer periods.
Clustering results showed that price is the main distinguishing feature that segregates listings into clusters. All other features, such as neighborhoods, property types, and room types, are well-represented across the clusters, which means they cannot discriminate among clusters. Knowing that there are differences in prices among clusters, renters who are looking for either short- or long-term stays can choose among the clusters with the same characteristics but the lowest price to make the most of their budget.
This exploratory data analysis is just a preliminary step in understanding AirBnB listings. Certain challenges were encountered during the conduct of the study. One of these is the integrity of the data obtained from insideairbnb.com. We noticed some listings with a nightly price of 1 AUD, which seems implausible. There are also hosts with zero total listings who are still included in the dataset.
Possible future work includes adding listings from other areas of Australia for a more comprehensive analysis of the Airbnb market in the country. Another possibility is to obtain the geographical polygons of the different neighborhoods in Melbourne to better see which neighborhoods cluster together based on different features. Exploring the subclusters of each cluster could also provide a more granular view of the features that differentiate them. Customers would be able to use this to identify which listings would be the perfect destination for their getaway. Hosts may take advantage of this to modify their listings so they can reach a specific target market. Lastly, Airbnb can use this information to help its recommender system maximize engagement within the app.